Automatic Semantic Classification of German Preposition Types: Comparing Hard and Soft Clustering Approaches across Features
نویسندگان
چکیده
This paper addresses an automatic classification of preposition types in German, comparing hard and soft clustering approaches and various windowand syntax-based co-occurrence features. We show that (i) the semantically most salient preposition features (i.e., subcategorised nouns) are the most successful, and that (ii) soft clustering approaches are required for the task but reveal quite different attitudes towards predicting ambiguity.
منابع مشابه
Exploring Soft-Clustering for German (Particle) Verbs across Frequency Ranges
In this paper we explore the role of verb frequencies and the number of clusters in soft-clustering approaches as a tool for automatic semantic classification. Relying on a large-scale setup including 4,871 base verb types and 3,173 complex verb types, and focusing on synonymy as a taskindependent goal in semantic classification, we demonstrate that low-frequency German verbs are clustered sign...
متن کاملA Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...
متن کاملDisambiguation of the Semantics of German Prepositions: a Case Study
In this paper, we describe our experiments in preposition disambiguation based on a – compared to a previous study – revised annotation scheme and new features derived from a matrix factorization approach as used in the field of distributional semantics. We report on the annotation and Maximum Entropy modelling of the word senses of two German prepositions, mit (‘with’) and auf (‘on’). 500 occu...
متن کاملSemantic Preserving Data Reduction using Artificial Immune Systems
Artificial Immune Systems (AIS) can be defined as soft computing systems inspired by immune system of vertebrates. Immune system is an adaptive pattern recognition system. AIS have been used in pattern recognition, machine learning, optimization and clustering. Feature reduction refers to the problem of selecting those input features that are most predictive of a given outcome; a problem encoun...
متن کاملDetection and Classification of Breast Cancer in Mammography Images Using Pattern Recognition Methods
Introduction: In this paper, a method is presented to classify the breast cancer masses according to new geometric features. Methods: After obtaining digital breast mammogram images from the digital database for screening mammography (DDSM), image preprocessing was performed. Then, by using image processing methods, an algorithm was developed for automatic extracting of masses from other norma...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016